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Abstract. Oscillations lie at the core of many biological processes, from the 
cell cycle, to circadian oscillations and developmental processes. Time-keeping 
mechanisms are essential to enable organisms to adapt to varying conditions 
in environmental cycles, from day/night to seasonal. Transcriptional regula¬ 
tory networks are one of the mechanisms behind these biological oscillations. 

However, while identifying cyclically expressed genes from time series measure¬ 
ments is relatively easy, determining the structure of the interaction network 
underpinning the oscillation is a far more challenging problem. Here, we explic¬ 
itly leverage the oscillatory nature of the transcriptional signals and present a 
method for reconstructing network interactions tailored to this special but im¬ 
portant class of genetic circuits. Our method is based on projecting the signal 
onto a set of oscillatory basis functions using a Discrete Fourier Transform. We 
build a Bayesian Hierarchical model within a frequency domain linear model in 
order to enforce sparsity and incorporate prior knowledge about the network 
structure. Experiments on real and simulated data show that the method can 
lead to substantial improvements over competing approaches if the oscillatory 
assumption is met, and remains competitive also in cases it is not. 

1. Availability: 

DSS, experiment scripts and data are available at http://homepages.inf.ed.ac.uk/gsanguin/DSS.zip 
2. Contact: 


D.Trejo-Banos@sms.ed.ac.uk 

3. Introduction 

Cyclic behaviour is ubiquitous in biology. The importance of oscillatory sys¬ 
tems stems both from the necessity to adapt to the many environmental cycles 
(circadian, annual, etc.), as well as to maintain intrinsically periodic processes such 
as the cell cycle. Both of these type of oscillations are essential to many physi¬ 
ological processes, and malfunctions in the cellular time keeping mechanisms are 
frequently associated with disease, further motivating the study of these systems 
[Bell et gj, 2005] . 

Genetic regulatory networks are at the core of many of these biological oscilla¬ 
tors. These networks can sustain oscillatory behaviour in protein levels through 
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specific architectures involving multiple feedback loops of transcriptional regula¬ 
tion. For example, a transcriptional oscillator is thought to drive the Arabidopsis 
thaliana circadian clock through mutual repression of three transcriptional regu¬ 
lators [Pokhilko et ai, 2012[ [McClung, 2011| . The cell cycle is another oscillatory 
process, which controls cell division and duplication. In the case of Saccharomyces 
cerevisiae, experiments and dynamical models suggest that the cell cycle is the result 
of a transition between two self maintaining steady states, driven by two antagonis¬ 
tic classes of proteins [Chen et ai, 2004| . Evidence suggests that a transcriptional 
network is an important part of this mechanism [Spellman, 1998[ |Li et al, 2004[ 
[Orlando et al, 2008| . 

These oscillators have been the subject of study for many years, but uncovering 
the exact mechanism is a challenge that involve many complex chemical, genetic 
and physiological components. It is therefore important to devise computational 
statistical methods which may guide experimental analyses by inferring potential 
regulatory interactions directly from time series gene expression data, which is 
usually easier to obtain. 

Network inference is a well established and rich domain of research in systems 
biology. State of the art methods for regulatory network inference include a wide 
variety of techniques from statistics and machine learning. For example, mutual 
information between gene expression levels under different experimental conditions 
is used by ARACNE [Margolin et al., 2006 and CLR [Faith et al, 200 7|, two of the 
most widely used methods for network reconstruction. GENIES [Huynh-Thu et al, 2010| , 
another method which was a top performer at the DREAM network inference chal¬ 
lenges, uses random forests to produce a weighted ranking over the network edges. 

Other methods recently used include regularized regression [Haury et al, 2012 


ANOVA [Kuffner et al, 2012[ and Hierarchical Gaussian models [Li et al, 2006| 
Most of these methods focus on steady state data, which is by definition not avail¬ 
able for oscillatory networks. 

Regularisation-based and Bayesian methods can also be adapted to time series 
data. Dynamic Bayesian Networks have long been a popular choice in network in¬ 
ference (e.g. [Oates et ai, 2012] ). Such methods present considerable advantages in 
being able to quantify uncertainty and to incorporate prior knowledge, but are often 
severely limited by computational constraints. Optimisation-based methods based 
on regularised regression [Bonneau et al., 2006| present often a scalable alternative 
at the cost however of some modelling flexibility. 

Here, we use a first order model of the system dynamics to constrain the net¬ 
work inference, but we explicitly take advantage of the oscillatory behaviour of the 
system by pursuing frequency-based estimation. We build a hierarchical Bayesian 
model over the network dynamics which can set and infer structural constraints and 
account for the inevitable uncertainty that experimental settings convey. Further¬ 
more, our method can easily integrate non-trivial side information, for example in 
the form of sequence similarity between promoter sequence of genes. Experimental 
results on real and simulated data highlight that the method offers an effective and 
flexible platform for statistical inference in oscillatory systems, and can uncover 
non-trivial biological information. 

The rest of the paper is organised as follows: the next section describes the 
methodology we use, reviewing the linear time-invariant approximation we use as 
well as introducing the Bayesian hierarchical framework for network inference. We 
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then present an experimental evaluation on three data sets: a synthetic data set 
from the DREAM network inference challenge, a simulated data set obtained from 
a state of the art model of the A. thaliana circadian clock Pokhilko et ai, 2010| , 
and a real data set from the yeast S. cerevisiae cell cycle [Orlando et al, 2008| . We 
then conclude the paper by discussing our method in the light of these experimental 
results and the existing literature on network inference. 


4. Methods 

Our approach is centred on the assumption that the oscillatory dynamics of the 
regulatory network can be reasonably approximated, in Fourier space, by a linear 
time invariant system. This is of course a simplification, but it is not an unrea¬ 
sonable one, and has been previously proposed as a formalism to model oscillatory 
genetic circuits with considerable success, see jPalchau, 2011| for a recent review. 
From the inferential point of view, adopting a frequency domain perspective is 
convenient, as it enables us to transform the network reconstruction problem in a 
regression problem, for which many advanced estimation tools exist. We choose a 
Bayesian regression approach, as it provides an effective methodology to integrate 
diverse information in the inferential machine. As a proof of principle of how non¬ 
trivial information can be incorporated, we discuss how sequence similarity between 
promoter regions could be used within a hierarchical model framework. 


4.1. Linear time invariant model. The starting point for our modelling is the 
approximation of the system’s dynamics as a Linear Time Invariant (LTI) model: 


( 1 ) 


dxi (t) 
dt 


N 

^ ^ ^ij^j (f) T ^iXi (t) -|- ^ ^ ^ik^k- 

j^i k 


Here the expression level of gene i, denoted as Xi (t), depends on the expres¬ 
sion levels of the other TV — 1 genes (potential regulators) through activating or 
repressing intensity Oy S K. Gene expression levels decay linearly with rates A^. 
Additionally, gene expression depends on a set of K inputs Uk which can be either 
external signals (light for example) or any other gene signal that is not modelled 
explicitly in the network. Finally, each gene has a basal transcription rate 6^. 

Having a set of M samples from an experiment (e.g. mRNA levels from a 
microarray experiment), let the vector S denote the set of M expression 
level measurements for gene i. We can further construct the matrix X £ , 

which contains the sample points for the set of N genes. Let X be the derivative 
of A, so equation Q in matrix form for this set of gene expression levels is given 
by: 


(2) X =XA^ -b bl-f UC^ 

where A £ is the matrix with diagonal elements Xi and off-diagonal ele¬ 
ments aij, the input signals are contained in matrix U £ To complete the 

notation, we denote with b vector of basal expression levels, which multiplies the 
M X N matrix of ones 1 to add a constant term to the equation. 

We proceed to compute the derivative x by first projecting the gene expression 
levels into a set of orthogonal basis functions. The chosen set of basis functions 
is the one given by the Discrete Fourier Transform of the gene expression levels. 
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We emphasize that the choice of basis function is dictated by the nature of the 
problem: while in the limit of a continuously sampled signal this choice would be 
irrelevant (any complete basis would yield perfect reconstruction), for discretely 
sampled signals the quality of the approximation to the signal (and its derivative) 
will depend on the expressiveness of the chosen finite set of basis functions. Our 
choice of basis functions is motivated by the prior knowledge that the signals of 
interest should be oscillatory, making the choice to work in the frequency domain 
particularly appealing. We denote X (a;), X for brevity, as the frequency represen¬ 
tation of X, with each column containing the frequency spectrum of the expression 
of a gene over the time points. The frequency domain derivative can be computed 
analytically by X = 27rn;fX, so the frequency domain representation of the system 
is given by: 


(3) X =XA'^ -t- UC"^. 

Basal rates b are included in the zero frequency component of X. The frequency 
representation of the inputs is given by U. 

To account for any discrepancies between the linearised model and the true 
system dynamics, we assume normally distributed error with variance a^. The 
likelihood function for equation ^ is: 


( 4 ) 


N . .. 

p(x|X,A,U,C,az,) cx H 


Qi= Xi-[ X U 


a;^ 

cT 


2=1 

i \ T 


Xi-[ X u 
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In general, multiple replicate time series may be available. Denoting with K 
the number of replicate time series, the overall likelihood, under an assumption of 
normal i.i.d error between series, can be generalized as: 


(5) p({Xfc}|{Xfc}A,U,C,ar3) 


: nP(Xfc|Xfe,A,U,C,az5) 

fc=i 


which is a product of Gaussian densities. 

Notice that the form of equation ^ is identical to a regression problem where 
the output variables (Fourier coefficients of the derivatives of the signals) are re¬ 
gressed onto the Fourier coefficients of the signals. The inference problem of 
estimating the interaction and input response matrices [ A'^ C'^ ] in equa¬ 

tion can therefore be attacked using the vast repertoire of regression meth¬ 
ods. Regularized regression methods have been tested in a network inference 
context, see [Charbonnier et ai, 2010[[Bergersen et al, 20TTj [Bonneau et al, 2006[ 


|Haury et al., 2012 . Here, we opt for a hierarchical Bayesian approach, that will 
allow us to leverage prior knowledge and integrate other sources of information. 


4.2. Hierarchical Bayesian modelling. To interpret dynamical systems in a 
network perspective, we assume that the interaction matrix in our LTI representa¬ 
tion 0 has a sparse structure representing discrete interactions between regulators 
and target genes. We introduce the structural adjacency matrix H G RNxN^ 
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sits at the top of the hierarchy. This matrix contains elements hij = 1 if gene j 
regulates gene i for i ^ j. In this Bayesian approach, a sparsity inducing prior over 
elements of H is necessary to aid identifiability and interpret-ability. The prior 
form chosen for elements hij is a Bernoulli distribution, with parameter w which 
has a Beta distribution prior due to conjugacy. 

We chose a spike and slab prior to relate the connection matrix H and inter¬ 
action matrix A. This distribution consists of a mixture of a degenerate distri¬ 
bution and a long tailed distribution. The form chosen is derived from the one 
presented in [Ishwaran et al, 2005| , where the Uij elements are drawn from a scale- 
mixture model where a zero-mean normal distribution has variance governed by 
hyper-parameter t^-. In this form, the hyper-variance hijr^j has a continuous bi- 
modal distribution. With this prior, the posterior distribution of the less relevant 
parameters is shrunk towards zero and the non-zero elements are selected by the 
distributions tail. The advantage of the continuous distribution implied by the 
scale-mixture model of [Ishwaran et al., 2005| lies primarily in the fact that we 
avoid the need to parametrize these bimodal distributions manually. 

Thus, the hierarchical model is defined by equations: 


p({Xfc}|{XaA,C,U,az3) 

P [aij\hij, Tij) 

(6) P (hij\w) 

TT (w) 

IT (t-2) 

- 


= nLiP(Xfc|A,C,U,X,,a,,) 

N (0, %t2 ) 

~ (1 — w) 5.1,0 + 

^ Beta(ai,a2) 

~ Gamma (bi,b2) 

~ Gamma (ci, C2). 


The parameter ao accounts for uncertainty related to noise and model mismatch, 
for example arising from the linear approximation to the system dynamics. The 
parameter vO is introduced for numerical stability and is fixed to the value of 0.005. 
The hyperparameters 01 ^ 2 , bi ^2 and ci ^2 can be fixed to reflect prior beliefs, or set 
to vague values to reflect prior ignorance; in the rest of the paper they are set to 
the default values of (1, 1), (5 , 50) and (0.001,0.001) respectively. 


4.3. Sequence information integration. A major advantage of hierarchical mod¬ 
elling is the possibility of integrating different data sources. By branching from the 
top of the hierarchy, we can define models for different network related characteris¬ 
tics and keep all the information coupled by the top of the hierarchy. For example, 
protein interaction and binding data from GhIP-chip or GhIP-seq experiments can 
be used in a straightforward manner to modulate the prior probabilities over matrix 
H, for example by adjusting the parameter w for individual edges. 

Hierarchical models also allow us to exploit more subtle sources of structural in¬ 
formation derived from an analysis of sequence information. Transcription factors 
bind to the promoter region of their targets by recognizing specific motifs, short 
DNA words; thus co-regulated genes (genes that are regulated by a common tran¬ 
scription factor) should share common motifs in their promoted regions. We use 
this information to draw the basic model for our sequence integration approach. As 
the transcription binding sites share a common motif, we assume that the similarity 
between two promoter regions varies proportionally to the number of shared regu¬ 
lators. In this way, an observed pairwise similarity matrix S = [s^-] between gene 
promoters, derived from a multiple alignment method like jSievers et al, 2011| or 
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Figure 1. Hierarchical Bayesian model, on top of the hierarchy 
(green) lies the adjacency matrix H and sparsity parameter w. In 
chequered circles the frequency-domain gene expression model and 
its parameters. In yellow the stripes sequence similarity and its 
parameters. 


an alignment-free method [Bonham et ai, 2013| , can be related to the structural 
adjacency matrix at the top of the hierarchical model. Assuming for simplicity a 
Gaussian observation model, we can then incorporate sequence similarity by posit¬ 
ing the following relationship between promoter similarity scores and the structural 
adjacency matrix 


(7) p (s„ |H,/3, aseq) oc exp [ - 7 ^^ ( Sij - X! 

y \ 

Here the parameter {f3i} 1 < I < N is the similarity “induced” by the I — th 
transcription factor (a proportionality constant), and the product huhji equals 1 if 
and only if genes i and j are both regulated by 1. This model is a form of additive 
clustering [Mirkin, 1987| . By conditioning on H, we can derive the distribution 
p(/3i|...), which is a Gaussian with non-negative constraints, (see appendix eq. 4). 
This distribution can be used for sampling posterior values of /3; in our applications, 
however, we preferred to fix the value of /? to its non-negative maximum likelihood 
solution, effectively approximating this conditional posterior with a S function. 
The similarity score variance aseq is given a weakly informative inverse Gamma 
prior. By completing the square we can derive a Gaussian distribution for the betai 
parameters, for its derivation and estimation see appendix section 1. The overall 
structure of the model is depicted graphically in Fig. 

4.4. Inference. Inference of parameters {A, C, H, aD,w, r} is done through a sim¬ 
ple Gibbs sampling scheme. Given conjugacy among distributions, sampling of 
these parameters is straightforward for all distributions except p (/?/). This dis¬ 
tribution is not conjugate, so a Metropolis within Gibbs would be necessary for 
exact inference. In order to improve performance and given the fact that retrieving 
the distribution over f3i is not an objective; we use the non-negative least square 
estimate for the vector j3. Gonvergence was tested by applying Geweke diagnostic 
over the last 1000 samples of matrix H. Mathematical derivations of the required 
conditional posteriors and the general sampling algorithm are described in the ap¬ 
pendix. 
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5. Results 

In this section we assess the performance of our method on two realistic simulated 
data sets and a real data set, comparing its performance to two other state of the 
art methods. We call our method DSS, for DFT-based Spike and Slab model. The 
first simulated data set was generated from a well known model for the A. thaliana 
circadian clock network [Pokhilko et al, 2010| . This model is a non-linear ODE- 
based model which exhibits regular oscillations (for suitable parametrisations), thus 
matching one of our main modelling assumptions. However, it is a non-linear model, 
hence introducing an element of model mismatch. As a second synthetic bench¬ 
mark data set we used one of the data sets provided by the DREAM 4 challenge 
[Marbach et al, 2010| . This is again a non-linear model, which exhibits damped 
oscillatory dynamics in some of the nodes; thus, this data set presents considerably 
more elements of model mismatch. The last experiment tested the method on a 
real data set of gene expression levels obtained in a micro-array experiment for the 
S. cerevisae cell cycle transcriptional network [Orlando et al, 2008| . 

Results were assessed in terms of area under the Precision-Recall (AUPR) curve; 
PR curves plot the fraction of correctly called instances versus the ratio of true 
positives over true positives plus false negatives. An ideal classifier would give a 
AUPR of 1, while a random baseline would return the ratio of positives negatives. 
Inference of the models parameters was conducted by Gibbs Sampling from the 
model presented in (Fig. . In total, 5000 samples were obtained. The last 1000 
samples were selected and averaged to compute the conditional probability of a link 
p {hij = 1|-) given the model and the expression data, see appendix sections 1.1.1 
and 1.1.2 for details into the inputs and outputs of the program. 

5.1. Competing methods. As a first comparison, in order to establish the validity 
of our claim that frequency domain analysis is beneficial for oscillatory networks, 
we sought to compare our results with a complete analogue in time domain. To 
do this, we implemented a spline-based alternative to the DFT, using cubic splines 
interpolation as means of computing the time domain derivative, while the rest of 
the hierarchical model was left unchanged. As competing methods to assess the 
performance of DSS we selected GENIES [Huynh-Thu et al., 2010| , which is based 
on random forests, and the ODE-regression based Inferelator [Bonneau et al, 2006[ 
[Greenfield et al., 2013| . 

In a network of N genes, GENIES solves N regression problems by predicting, 
using random forests, the expression level of each gene as a function of the other N-1 
genes (putative regulators). Then the relative importance of each gene expression 
is evaluated and the putative gene interactions are ranked. GENIES was designed 
for steady state data, but time-series adaptation can be readily derived and was 
provided to us by one of the authors. 

The Inferelator estimates the parameters of an ODE system using regression with 
L 1-regularization over a finite element approximation of the derivative. The method 
has been extended [Greenfield et al, 201S| , with new functionalities to incorporate 
prior information over the network links, and to use alternative optimisation meth¬ 
ods for model selection, including the elastic-net (regularization over LI and L2 
norms) and Bayesian regression with best subset selection. 

Finally, as a simple baselines, we implemented a LI regularised version of the re¬ 
gression problem in equation ([^ , using the LASSO implementation [Tibshirani, 1994] . 
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5.2. A. thaliana circadian clock. As a first example we used data generated 
from a well known oscillatory network model, the A. thaliana circadian clock. 
The data consists of simulated niRNA measurements from the model found in 
[Pokhilko et ai, 2010| . This non-linear model has 7 transcription factors and 2 post 
transcriptional elements ZTL and LHYmod. In order to replicate experimental con¬ 
ditions, we assume that only mRNA data is available, so protein concentrations for 
the transcriptional and post-transcriptional elements are assumed unobserved. The 
transcription factors used for network inference are ’LHY ’TOCl ’PRR5 hy¬ 
pothetical gene ’Y ’GI ’PRR9’ and ’PRR7’, the post-transcriptional elements 
are not considered. A graphical representation of the model can be observed in 
(appendix Fig. 1). This model was simulated for 3 cycles obtaining 28 samples. 
The procedure was performed with a light/dark photo period of 12/12, 6/18, 8/16, 
18/6 and 20/4 hours which are represented in our model by binary input signals 
U. This design of our study is created to mimic a realistic experimental setting 
as in [Edwards et ai, 2010| ; the biological rationale for such design is that stim¬ 
ulating the system with these different inputs may tease out the contribution of 
the main drivers of the clock at different times of day. We also simulated knock¬ 
out mutants ATOCl, APRR7PRR9, ALHY and AGI by the same procedure as 
presented in [Pokhilko et al, 2010| with photo periods of 12/12 hours. These ex¬ 
periments amount to 14 time series; as these data are directly the outputs of an 
ODE model (without any additional noise) we define this idealised data set as 
the noiseless data set. An additional dataset of 14 time series was produced by 
adding Gaussian white noise with a Signal-to-noise (SNR) of 10. An example of 
the simulated expression levels is plotted in the upper left panel of Fig[^ 

Using the model specification as ground truth, we proceeded to draw the PR- 
curves for the different methods and computed the area under the PR-curve for all 
the resulting networks. These areas are plotted for the noiseless (simulated data 
without added noise) and noisy data in the upper right panel of Fig. For the case 
of noisy data, we generated 20 data sets by adding lOSNR-white-Gaussian noise 
to the noiseless data set. The simulated data has a very well defined oscillatory 
behaviour, as such the DSS method proved superior over the alternatives, as can 
be appreciated in the chart. The DSS method achieved an AUPR of 0.57 and 
mean 0.55 with std of 0.1 in the noisy case, being over the mean GENIES and 
Inferelator estimates with these later yielding less variance in their scores (0 variance 
in case of the Inferelator). The LASSO solution to the quadratic form also shows 
an improvement over both methods, even though DSS still has the best results. 
Interestingly, the spline solution achieves comparably good performance with DSS 
for the noiseless case, but under performs in the noisy case. We speculate that 
this may be due to the sensitivity of the spline estimate of the derivative to noise, 
confirming our intuition that adopting a set of basis functions well adapted to 
the problem may convey an advantage. It is intriguing that the method actually 
obtains a better performance on noisy data sets; we speculate that this may be due 
to the fact that adding noise alleviated the effects of model mismatch (resulting 
from the LTI approximation). Intuitively, in the absence of noise the attempts to 
fit non-linear data with a linear model could become more problematic. 

To test the effect of including side information, we simulated a between-gene 
similarity matrix by drawing f3i from a uniform distribution U(0.1,0.6) and using 
Equation In this case we notice an important improvement by observing an 
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Expression profiles for the A. lhaliana simulated wild type aUPR for the A. thaiiana circadian dock model 
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Figure 2. Top left are the simulated gene expression profiles for 
the wild type data set with SNR 100. Top right are the AUPR 
values for the 2 different noise levels. Bottom left is the true net¬ 
work topology, going from blue (regulators) to red (targets). Bot¬ 
tom right is the inferred network topology obtained by setting a 
threshold of 0.5 over the inferred matrix H 

increment in the AUPR to 0.68 in the noiseless case and a mean of 0.57 with std of 
0.1 for the noisy data. The principal objective of using this simple simulated simi¬ 
larity matrix was to confirm that structural information can be retrieved and used 
as aid for inference. By clustering the co-regulated elements we added additional 
structural constraints into the inference scheme. 

Finally we included a graphical representation of the true network (Fig. [^bottom 
left) and a network resulting from setting a threshold of 0.5 over the inferred matrix 
H (DSS -with-similarity, noiseless data) (Fig. [^bottom right). We notice that 
the 0.5 threshold, while reasonable, is still arbitrary and is used here only for the 
purposes of graphical visualisation. The full output from the method is a probability 
over the existence of edges, and can be better visualised as a heatmap, see appendix 
sections 1.2 and 2. Directed edges go from blue (regulators) to red (targets), black 
edges mean bidirectional regulation. As can be appreciated important features 
such as the bidirectional regulation between ’LHY’-’PRR7’ and ’LHY’-’PRR9’ are 
recovered. Errors are related to the roles of ’PRR7’ and ’PRR9’ regulating ’GP 
instead of ’TOCl’. This may be due to the method confounding the effects of 
’TOCl’ over these former elements as being closer to the expression patterns of GI. 
This difficulty discriminating between the roles of the ’PRR’ genes is also expressed 
by inferring the spurious bidirectional edge between ’PRR7’-’PRR9’. 

5.3. DREAM Challenge. As a second example, we considered a data set from 
the fourth edition of the DREAM competition [Marbach et al, 2010| .This data set is 
obtained from simulating a 10-node network, of which three nodes are input nodes; 
15 regulatory links are present. Three simulations were present, one with an ODE- 
based system, another one with a Stochastic Differential Equation (SDE) system 
and a third one with SDE-based system and added experimental noise. Five time 
series are provided for each system, a time series contains 21 samples. The network 
is subjected to a single node perturbation, which mathematically corresponds to a 
change in the basal expression parameter, so the mean expression level of the node 
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changes for half of the time points. The expression profiles for the set of ten genes 
in one time series is presented in (Fig. [^top left). This data set does not comply 
with the main assumption of the model (it shows irregular damped oscillations); 
we therefore expect performance not to be optimal, but it is still useful to evaluate 
comparatively the model under such a model mismatch scenario. 

Figure (Fig. shows a comparison of the area under the P-curve for the three 
simulated systems. Of these, DSS achieves better performance in the ODE-based 
simulation, by having an AUPR of 0.31, higher than the nearest best method (GE¬ 
NIES). Inferelator could not be executed on this data set due to numerical is¬ 
sues (some expression levels are exactly zero in this example). The performance 
improved for the SDE based simulation, by achieving an AUPR of 0.35, above 
inferelator’s 0.27. Slightly worse results were achieved for the SDE model with 
experimental noise, achieving an AUPR of 0.3. By simulating a sequence similarity 
matrix performance was improved for both ODE and SDE solutions. In the case of 
SDE the solution improved dramatically to 0.42. 

As in the previous experiment, the network and its inferred counterpart are 
presented in Fig [^bottom left and bottom right respectively. The inferred network 
is obtained by setting the threshold to 0.5 over the inferred adjacency matrix for the 
SDE data with added similarity matrix. As can be observed in the true network, 
nodes "Gl" and "GIO" are constant inputs. Node "G9" is subjected to perturbation 
for half the time points, thus its effect is propagated through the network by node 
"G5". 

In the inferred network we can observe some interesting characteristics. First, 
nodes "Gl" and "G9" are identified as input nodes, node "GIO" is incorrectly 
identified as an output only node. Node "G2" maintains its out-degree of 4 even 
though it’s regulators are not correctly identified. Nodes "G9" and "G5" are shown 
with increased in and out-degree, this may also be due to the confounding effects of 
the their "parent-son" relationship, specially considering that the perturbed "G9" 
node has the biggest amplitude of the gene expression profiles, as appreciated by 
the red curve in the top left plot in Fig. 


5.4. S. cerevisae cell cycle. For the last experiment we used a real time series 
data set collected during the S. cerevisiae cell cycle. Our evaluation is based 
on the genes identified by [Haase et al., 201^ [Orlando et al., 2008| a nd some of 
their interactions on the dynamical model found in [Ghen et al, 2004|. The main 
transcriptional elements selected were ’SWI5’, ’YHPU, ’SWI4’, ’FKHU, ’SIGl’, 
’AGE2’, ’YOXF, ’STBU, ’NRMU, ’WHI5’, ’FKH2’, ’MGMF, ’SWI6’, ’HGMU, 
’NDDl’ and ’MBPl’. Their putative regulations were extracted from literature 
{see appendix} for the putative network used as ground truth. 

The source for the gene expression data is [Orlando et al., 2008| , it contains 2 
wild type replicates and two mutant replicates (Ac/51,2, 3,4, 5, 6) each one con¬ 
taining 14 samples for each gene during approximately 2 cell cycles. Addition¬ 
ally, we downloaded promoter sequence information from [Zhu et al., 1999] for all 
the network elements. We then proceeded to use the multiple alignment software 
Glustal Omega [Sievers et al, 2011[ to obtain an alignment-based similarity matrix 
S between sequences. As an alternative way of encoding sequence information, 
an alignment-free similarity matrix S2 was built using the method described in 
[Sims et al., 200^ . 
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AUPR for the DREAM4 oscillating network data set 




Random 
I GENIE3 
I Inferelftlor 
I LASSO 
I Splines 
I Splines with S 

loss 

I DSS with S 


Figure 3. Top left is the expression profiles for the SDE model 
with experimental noise, node "G9" in red presents a perturbation 
over half the time points. Top right are the AUPR values for the 
three simulation models. Bottom left is the true network topology, 
from blue (regulators) to red (targets). Bottom right is the inferred 
network obtained by setting a threshold of 0.5 over the inferred 
matrix H 


We tested three subsets of data, one containing only the wild type expression 
profiles, other containing only the mutants expression levels, the last data set was 
the normalized concatenation of both. As an example of the observed gene ex¬ 
pression levels. Fig. |^top panel shows the gene expression levels for the wild type 
conditions. 

The AUPR from applying the various methods to this data are presented in Fig. 
[^bottom left panel. In this case DSS identifies the putative network well above the 
random baseline of 2.1 and above the competing methods. In the case of wildtypes 
the AUPR of DSS was of 0.24. In the mutant data sets the performance of DSS 
improves by including sequence similarity achieving an AUPR of 0.2607 and 0.2608 
for S and S2 respectively. The best overall performance was achieved by using the 
combined data set with sequence similarity matrix S2, resulting in an AUPR of 
0.267. 

The network in (Fig. bottom right is obtained by setting the threshold of 0.9 
to the inferred network from the combined wild type and mutant dataset with added 
similarity matrix. In this case FKHl has a central role in the inferred network, be¬ 
ing fully connected to the other elements. Even though this fully connected position 
is biologically implausible, it does reinforce the important role of FKHl in the cell 
cycle, e.g. its role in regulating the M-phase response |Kumar et ai, 2000] . An¬ 
other noticeable inferred link concerns the post transcriptional regulation of SWI6 
by WHI5p ITurner, 2012] ; this regulation was also considered as part of the ground 
truth network, as in the case of the yeast cell cycle transcriptional and post tran¬ 
scriptional regulations are intertwined [Haase et ai, 201^ . Also worth noticing the 
regulation of SWI6 by YOXl (member of the SBF complex) even though evidence 
suggests causality may be in the opposite direction [Venters, 201 Ij . SWI4 and SWI6 
form part of transcription factor complexes SBF and MBF, as such, their regula¬ 
tions may be confounded. This can be appreciated in the regulation of NRMl by 
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Wild type S. cerevisae expression level profiles 




simulation oata set 



Figure 4. Top wild type yeast expression profiles for the selected 
genes, bottom left AUPR for the three different data combinations, 
wild type, mutants, and both. Bottom right network obtained by 
setting a threshold of 0.9 over matrix H 


SWI4 in the inferred network, when in fact NRMl appears to be regulated by SWI6 
IDeJesus et ai, 2013 . The transcriptional activator NDDl is essential during the S- 
phase |Loy et al, 1999| , NDDlp along MCMlp bind to FKH2p [Haase et al, 2014| , 
this effect may be observable in the inferred network by directed edges from NDDl 
to YOXl and from YOXl to FKH2. 

By observing the AUPR plot we see that mutant data appears to be more infor¬ 
mative in this case than wild type, being only marginally inferior to the combined 
data set with similarity matrix. Part of the experimental design in selecting mu¬ 
tations in [Orlando et al, 2008| was aiming at attenuating the effects of the post- 
transcriptional elements of the cell cycle; the stronger performance of our method 
on the mutant data sets may be explained by this experimental design. 

Generally, the DSS solution will find the most relevant edges in the network to 
explain the observed dynamics, while the DSS with similarity method will find the 
most relevant solution that includes a grouping of the proposed edges according to 
the similarity matrix. So both results can be analysed separately and may offer 
additional insight over the whole network behaviour. With this purpose the six 
inferred networks and the putative ground truth are included in (appendix Fig. 3) 
for analysis. 


6. Discussion 

Inference of gene regulatory networks from expression data is one of the best 
studied problems in systems biology. Despite this considerable collective effort, 
the general problem remains ill-posed and, in the absence of extensive data sets 
and strong domain expertise, a solution to this problem remains elusive. In this 
light, it is of interest to consider more delimited problems which may be amenable 
to specialised but more effective solutions. Oscillatory systems present a prime 
example of such a problem: while they obviously constitute a specialised subset 
of regulatory networks, in our opinion they are sufficiently widespread to warrant 
tailor-made solutions. DSS couples a simplified mechanistic approach (LTI) with 
frequency domain information to provide such a method. LTI methods in the time 
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domain for A. thaliana with experimental data have been studied in [Dalchau, 2011| . 
Our results on the circadian clock simulation suggest that this frequency domain 
approach can indeed be fruitful when the model assumptions are reasonably met. 
As Results over the DREAM and S. cerevisae data sets suggest that the method 
can perform competitively with state of the art methods also when the model 
assumptions are not precisely met (damped oscillatory behaviour); however, in 
these cases the methods competitive advantage is smaller or inexistent. 

The use of derivative and ODE information in a network inference framework 
has some precedents. A method that is in spirit similar to our approach is Inferela- 
tor [Bonneau et al., 2006| . It casts the network inference problem as a a parameter 
inference problem over a first order differential equation system, then estimates the 
system parameters via regularized regression over a finite differences solution to the 
system. Recently Bayesian approaches that make use of the derivative information 
have also been proposed. In [Oates et al., 2012] a probabilistic model for integrat¬ 
ing a linearised version of network dynamics in a regression framework is presented. 
jPondelinger et al., 2013] attacked the problem of parameter inference of an ODE 
system jointly with a Bayesian regression over the gene expression levels. The basis 
of this model is a Gaussian process with product of experts likelihood, not dissimilar 
from our model in equation ([^. However, the authors in jPondelinger et al, 2013] 
did not attempt a joint parameter estimation and variable selection problem, stop¬ 
ping short of formulating the problem in terms of network inference. Basis functions 
in time domain (splines) have already been applied to network inference problems 
in systems biology, primarily to model unknown non-linear transition functions 
[Morrissey et al., 2011| . The distinctive part of our work is the proposal of a fre¬ 
quency domain approach for oscillatory systems, and in particular the embedding 
of our method within a hierarchical framework where integration of additional in¬ 
formation is natural. We expect that non-linearities encoded as basis functions 
as in [Morrissey et al, 201 Ij would be a valuable extension of our work and likely 
result in an improvement in performance. 

While we believe that the DSS method provides promising results, there are 
several inherent limitations in our approach. Importantly, the LTI approximation 
implies that self regulation is confounded with decay, so such types of interactions 
cannot be identified. Empirical results also seem to suggest that post transcriptional 
interactions may be confounded with transcriptional interactions; this is to be ex¬ 
pected, as post-transcriptional interactions are not modelled in our framework. For 
such reasons, direct application to models that include complex post-transcriptional 
interactions, such as [Pokhilko et al., 2012| , is not advised. Furthermore, as all 
Bayesian network inference methods, DSS also suffers from multi-modal posterior 
distributions. The use of auxiliary information, such as sequence similarity, can be 
beneficial to ameliorate this problem. Many different types of auxiliary informa¬ 
tion can be considered, and indeed alternative models for incorporating sequence 
similarity could also be used. A major strength of a Bayesian hierarchical model is 
that different models for auxiliary information could be easily incorporated within 
the DSS framework. 
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Appendix for A Bayesian approach for Structure learning of Oscillating Regula¬ 
tory Networks 

Appendix A. Gibbs sampler for the Hierarchical Bayesian model. 
Likelihood funcion for the LTI system. 
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Now by using the vectorization transformation for an arbitrary matrix M, such 
that M = vec (M) and properties 

Tr (MTn) = MTn 
vec (MN) = (I G M) N 

Then we have 
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then by using vectorization we write the prior in canonical form 
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p(B|H,t) oc exp|^--B"rB 
r = diag ) 

finally multiplying by the gaussian with hypervariance given by the spike and 
slab prior and completing the square we get 
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Spike and slab prior over the coefficients 
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having vectorS = representing the off-diagonal elements of the upper 

triangular matrix of S, vectorrepresenting the i — th row vector of H and o 
representing the elementwise product (Hadamard product). Then the distribution 
of the upper diagonal elements is: 
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P ~ P (S|H,/3, ajq) p {aj^) 
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Sampling (3 

Instead of sampling from p (/3;|.) we use the non-negative least square estimate 
of (3, by solving the quadratic form 
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Conditional distribution for hij 

We define matricesH^j* = [h^ o hj] such that hij = uO, and Hb = [L o hj] 
such that hij = 1. 
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A.l. Algorithm. The Algorithm presents the Gibbs sampler for parameter in¬ 
ference in the Hierarchical model. A brief overview of the inputs and outputs is 
also presented. 


A.1.1. Program Inputs. Due that the DFT computed by the FFT algorithm yields 
complex numbers, a real representation of the coefficients is needed. We work with 
the so-called Real Discrete Fourier Transform. It consists of stacking the real over 
the imaginary part of the first (M — l)/2 FFT coefficients. We denote this M x N 
matrix X. Figures (|^ and (|^ show plots for the DFT real and imaginary parts 
of the A. thaliana data-set, with a photoperiod of 12 hours. The magnitude and 
phase spectra is also shown, finally the RDFT is presented at the bottom of the 
picture. 

Hyperparameters ai and 02 are set to 1 so re is uniformly distributed in the inter¬ 
val [0,1]. Parameters hi and &2 are set so the hypervariance has a continues 
bi-modal distribution, according to the recommendations of [Ishwaran et ai, 2005| , 
in which they set them to 5 and 50 respectively. Alternative parametrization of 50 
and 500 was also tested yielding better results in some cases. 
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Figure 5. Spectra for the A. thaliana circadian clock simulation 
with a 12 hour photo-period 
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Figure 6. Derivative Spectra for the A. thaliana circadian clock 
simulation with a 12 hour photo-period 

The hyperparameters ci and C 2 for(T^^ are set to 0.001 and 0.001, this is a weak 
prior reflecting uncertainty about the linearity of the system. Hyperparamters di 
and c ?2 Eire set to 10 and 0.001, this parametrization required a manual tunning, as 
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the scale parameter crjgg having a weak prior resulted in the effects of the sequence 
similarity model to banish. By modifying this prior we can give more “weight” to 
the sequence similarity clustering thus, the flexibility of the Hierarchical model. 



12/12 photoperiod-ilght input 



6/18 photoperiod-ilght input 



18/6 photoperiod-light input 



Time 


Time 


Time 


Figure 7. Three examples of light input for the A. thaliana cir¬ 
cadian clock simulation. 


A.1.2. A. thaliana light inputs U. 


A.1.3. Output. The Gibbs sampler presented in the previous section allows us to 
draw samples from the joint conditional distribution p (H, A.,C, j3, w, r, ct/j, cts | {Xk}) 
The marginal distribution for each of the models parameters can be drawn from 
this joint distribution, and the expected value for each parameter equals to the 
average of the samples. 

For example, figure illustrates the expected value for matrix H obtained from 
averaging over 1000 samples drawn from the marginal distribution p(H|{Xk}). 
This figure shows in dark red those elements with higher probability of a regula¬ 
tory interaction under the model assumptions, except the diagonal elements, which 
represent the decay rates of the equation model. The AUPR were computed by 
thresholding the off-diagonal elements of this matrix for each data-set. 
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Algorithm 1 Algorithm for the DFT-based Spike and slab prior model with se¬ 
quence similarity. 

Inputs: K time series of M time-points for N gene expression levels, encoded in 
matrices {aife}. Prior hyper-parameters oi, 02 , 62 , ci, C 2 . Optional similarity 

matrix S. 

Outputs: Joint conditional posterior distribution p (H, A, C, /3, w, r, ajj, a^l {Xk}) 
(1) Obtain the DPT of {xk} and the corresponding RDFT coefficient matrices 
{Xk} 


{x.} 


( 2 ) 

(3) 

(4) 

(5) 

( 6 ) 

(7) 

( 8 ) 

(9) 

Note: A burn in period of 4000 samples is considered in the general purpose 
implementation of the model. 


Compute the derivatives 
Sample from the conditional distribution over the LTI coefficients, given in 

eq. ^ 

Sample from the conditional distribution overr“^ given by eq. (12 1 
Sample H from eq. (18), to account for the decay rates we set diagonal 
elements ha to 1, and set the diagonal elements of matrix A to negative. 
Sample w from eq. (13) 


Sample ao from eq. (14) 

OPTIONAL sample Useg from eq. (16) and /3 from the nonnegative least 
squares solution to equation 0 - 
Return to step 3 



Figure 8 . Heat-map representing the expected value for p(H|.) 
obtained by averaging the last 1000 samples. Rows represent tar¬ 
gets and columns regulators. The diagonal indicates the decay 
parameters A. 
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Appendix B. Heatmaps and PR curves for the experiments 



Posterior H for the 1QSNR dataset 



Posterior H with S for the 10SNR dataset 



Figure 9. Heatmaps representing the posterior probability for the 
A. thaliana circadian clock network 


precision-recall graph 



Figure 10. PR cnrves for the A. thaliana circadian clock network 
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Posterior H for the ODE dataset 



"G1 •G2’G3"G4"G5‘G6"G7"G8'G9'G10" 


Posterior H with S for ODE dataset 



“G1 "G2’G3"G4"G5‘G6"G7"G8’G9'G10" 


Posterior H for the SDE dataset 



"G1 •G2*G3"G4"GS‘G6"G7’G8*G9'G10" 
Posterior H for the NSDE dataset 



"G1 ■G2*G3“G4"GS‘G6"G7"G8*G9'G10" 


Posterior H with S for the SDE dataset 



"G1 •G2’G3"G4"G5‘G6"G7"G8'G9'G10" 
Posterior H with S for the NSDE dataset 
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Figure 11. Heatmaps representing the posterior probability for 
the Dream4 Challenge network 


precision-recall graph 
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Figure 12. PR curves for the DREAM4 challenge network 
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Posterior H for the Mutant dataset 



Posterior H with S for the Mutant dataset 








Figure 13. Heatmaps representing the posterior probability for 
the s. cerevisiae network 


precision-recall graph 



Appendix C. A. thaliana circadian clock 
The arabidopsis thaliana circadian clock model as presented in, is shown in Fig. 

m 
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Figure 15. A. thaliana circadian clock model, transcriptional ele¬ 
ments LHY, PRR9, PRR7, NI, Y, and TOCl. Post-transcriptional 
elements ZTL, TOClmod and LHYmod. Light input is repre¬ 
sented by a lighting symbol. Activating interactions are repre¬ 
sented by solid line with arrows, repression by solid line with rect¬ 
angles at the end, post transcriptional interactions are represented 
by dashed lines. . 


Appendix D. DREAM4 network 

The 10-node oscillatory network that was part of the DREAM4 challenge sup¬ 
plementary information data set is presented in Fig. |16| 


















OSCILLATORY-NETWORK LEARNING 


26 


G10 G9 



Figure 16. DREAM4 challenge network with 10 nodes, of those 
3 are inputs, node G9 was subjected to a perturbation for half the 
time points. 


Appendix E. S. cerevisae cell cycle network 


In Fig. 17 the inferred networks after thresholding the value of p {hij = 
presented, the putative ground truth matrix is presented on Fig. 
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Figure 17. Inferred yeast networks for different data subsets with 
and without sequence information, edges go from blue (regulators) 
to red (targets).. 
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Putative regulatory network siCI 



Figure 18. Putative yeast regulatory network edges go from blue 
(regulators) to red (targets).. 


The putative regulatory network was built based on these references. 
• Regulation of SICI by SWI5 as in |Weiss^_2^^ 


Regulation of SWI4 by YHP1 as in |Bahler, 2005| . 

Regulation of YHP 1, SWI4, YOXl and HCMl by SWI4 as in |Macisaac et al, 2006] . 
Regulation of SWI5 and ACE2 [Haase et al, 2014] ; YHPl [Macisaac et al, 2006| ; 


SICI, YOXl and HCMl [Venters et al, 2011|; NDDl [Ostrow et al, 2014 
by FKHl. 


Regulation of SWIG and MBPl by NRMl as in [Macisaac et al, 2006 


Regulation of SWIG, SWI4 and MBPl by WHI5 as in [Haase et al, 201^ . 


ACE2 and 


Regulation of SWI5, YHPl and FKHl [Macisaac et al, 2006 
NDDl [Haase et al, WL^ by FKH2. 

Regulation of SWI4 and SWI5 by MCMl as in [Macisaac et al, 2006] . 

Regulation of SWI4, FKHl, YOXl, NRMl, HCMl and NDDl as in |Macisaac et al, 2006|. 
Regulation of YHPl, FKHl, FKH2, WHI5 and NDDl by HCMl as in 
[Pramila et al, 2006] . 

Regulation of SWI5 and ACE2 [Haase et al, MTi] ; YHPl and HCMl as in 
[Macisaac et al, 2006|. 

Regulation of YHPl, FKHl, YOXl, NRMl and HCMl by MBPl as in 
[Macisaac et al, 2006| . 

Regulation of NDDl [Macisaac et al, 2006] ; SWIG and MBPl by the inter¬ 
action of transcription factor MBF with STBl [Stillman, 2013| . 

Regulation of SWI4 and SWIG by its cobinding with MCMlp as in [Haase et al, 2014] . 
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